Pathway analysis and transcriptomics improve protein identification by shotgun proteomics from samples comprising small number of cells - a benchmarking study

Author: Brusic Vladimir
Fenyo David
Ivanov Alexander R.
Karger Barry L.
Li Siyang
Lisacek Frederique
Murthy Shashi K.
Sun Jing
Zhang Guang Lan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

BACKGROUND: Proteomics research is enabled by high-throughput technologies, but our ability to identify the expressed proteome is limited in small samples. The coverage and consistency of proteome expression are critical problems in proteomics. Here, we propose pathway analysis and a combination of microproteomics and transcriptomics analyses to improve mass-spectrometry protein identification from small samples. RESULTS: Multiple proteomics runs using the MCF-7 cell line detected 4,957 expressed proteins. About 80% of expressed proteins were present in MCF-7 transcript data; highly expressed transcripts are more likely to have expressed proteins. Approximately 1,000 proteins were detected in each run of the small-sample proteomics. These proteins were mapped to gene symbols and compared with gene sets representing canonical pathways; more than 4,000 genes were extracted from the enriched gene sets. The identified canonical pathways largely overlapped between individual runs. Of the identified pathways, 182 were shared between three individual small-sample runs. CONCLUSIONS: Current technologies enable us to directly detect 10% of expressed proteomes from small samples comprising as few as 50 cells. We used knowledge-based approaches to elucidate the missing proteome that can be verified by targeted proteomics. This knowledge-based approach includes pathway analysis and a combination of gene expression and protein expression data for target prioritization. Genes present in both the enriched gene sets (canonical pathways collection) and in small-sample proteomics data correspond to approximately 50% of expressed proteomes in larger-sample proteomics data. In addition, 90% of targets from canonical pathways were estimated to be expressed. The comparison of proteomics and transcriptomics data suggests that highly expressed transcripts have a high probability of protein expression. However, approximately 10% of expressed proteins could not be matched with the expressed transcripts. The cost of this publication was funded by Vladimir Brusic. (Vladimir Brusic) Published versio

Crossref

Boston University Institutional Repository (OpenBU)

Springer - Publisher Connector

PubMed Central

Nazarbayev University Repository

swissPIT: a novel approach for pipelined analysis of mass spectrometry data

Author: Appel Ron D.
Hernandez Céline
Hernandez Patricia
Lisacek Frederique
Maffioletti Sergio
Masselot Alexandre
Pautasso Cesare
Quandt Andreas
Publication date: 02/08/2017

The identification and characterization of peptides from tandem mass spectrometry (MS/MS) data represents a critical aspect of proteomics. Today, tandem MS analysis is often performed using only a single identification program, achieving identification rates between 10% and 50% (Elias and Gygi, 2007). Besides the development of new analysis tools, recent publications also describe the pipelining of different search programs to increase the identification rate (Hartler et al., 2007; Keller et al., 2005). The Swiss Protein Identification Toolbox (swissPIT) follows this approach, but goes a step further by providing the user with an expandable multi-tool platform capable of executing workflows to analyze tandem MS-based data. One of the major problems in proteomics is the absence of standardized workflows to analyze the produced data. This includes the pre-processing part as well as the final identification of peptides and proteins. The main idea of swissPIT is not only the use of different identification tools in parallel, but also the meaningful concatenation of different identification strategies at the same time. swissPIT is open source software, but we also provide a user-friendly web platform, which demonstrates the capabilities of our software and which is available at http://swisspit.cscs.ch upon request for an account. Contact: [email protected]

RERO DOC Digital Library

UniCarbKB: building a knowledge platform for glycoproteomics

Author: Akune Yukie
Aoki-Kinoshita Kiyoko F.
Campbell Matthew P.
Gasteiger Elisabeth
Lisacek Frederique
Mariethoz Julien
Packer Nicolle H.
Peterson Robyn
Publication date: 02/08/2017

The UniCarb KnowledgeBase (UniCarbKB; http://unicarbkb.org) offers public access to a growing, curated database of information on the glycan structures of glycoproteins. UniCarbKB is an international effort that aims to further our understanding of structures, pathways and networks involved in glycosylation and glyco-mediated processes by integrating structural, experimental and functional glycoscience information. This initiative builds upon the success of the glycan structure database GlycoSuiteDB, together with the informatic standards introduced by EUROCarbDB, to provide a high-quality and updated resource to support glycomics and glycoproteomics research. UniCarbKB provides comprehensive information concerning glycan structures, and published glycoprotein information including global and site-specific attachment information. For the first release over 890 references, 3740 glycan structure entries and 400 glycoproteins have been curated. Further, 598 protein glycosylation sites have been annotated with experimentally confirmed glycan structures from the literature. Among these are 35 glycoproteins, 502 structures and 60 publications previously not included in GlycoSuiteDB. This article provides an update on the transformation of GlycoSuiteDB (featured in previous NAR Database issues and hosted by ExPASy since 2009) to UniCarbKB and its integration with UniProtKB and GlycoMod. Here, we introduce a refactored database, supported by substantial new curated data collections and intuitive user-interfaces that improve database searchin

RERO DOC Digital Library

GlycoDigest: a tool for the targeted use of exoglycosidase digestions in glycan structure determination

Author: Abrahams Jodie L.
Campbell Matthew P.
Gotz Lou
Karlsson Niclas G.
Lisacek Frederique
Mariethoz Julien
Packer Nicolle H.
Rudd Pauline M.
Publication date: 02/08/2017

Summary: Sequencing oligosaccharides by exoglycosidases, either sequentially or in an array format, is a powerful tool to unambiguously determine the structure of complex N- and O-linked glycans. Here, we introduce GlycoDigest, a tool that simulates exoglycosidase digestion, based on controlled rules acquired from expert knowledge and experimental evidence available in GlycoBase. The tool supports the targeted design of glycosidase enzyme mixtures by allowing researchers to model the action of exoglycosidases, thereby validating and improving the efficiency and accuracy of glycan analysis. Availability and implementation: http://www.glycodigest.org. Contact: [email protected] or [email protected]

RERO DOC Digital Library

Clustering and Filtering Tandem Mass Spectra Acquired in Data-Independent Mode

Author: Gluck Florent
Lisacek Frederique
Muller Markus
Nikitin Frederic
Pak Huisong
Scherl Alexander
Publication date: 18/06/2018

Data-independent mass spectrometry activates all ion species isolated within a given mass-to-charge (m/z) window regardless of their abundance. This acquisition strategy overcomes the traditional data-dependent ion selection, boosting data reproducibility and sensitivity. However, several tandem mass (MS/MS) spectra of the same precursor ion are acquired during chromatographic elution, resulting in large data redundancy. Also, the significant number of chimeric spectra and the absence of accurate precursor ion masses hamper peptide identification. Here, we describe an algorithm to preprocess data-independent MS/MS spectra by filtering out noise peaks and clustering the spectra according to both the chromatographic elution profiles and the spectral similarity. In addition, we developed an approach to estimate the m/z value of precursor ions from clustered MS/MS spectra in order to improve database search performance. Data acquired using a small precursor mass window of 3 m/z units and multiple injections to cover an m/z range of 400-1400 were processed with our algorithm. It showed an improvement in the number of both peptide and protein identifications by 8% while reducing the number of submitted spectra by 18% and the number of peaks by 55%. We conclude that our clustering method is a valid approach for data analysis of these data-independent fragmentation spectra. The software including the source code is available for the scientific community.

RERO DOC Digital Library

UniCarb-DB: a database resource for glycomic discovery

Author: Campbell Matthew P.
Hayes Catherine A.
Karlsson Niclas G.
Lisacek Frederique
Packer Nicolle H.
Rudd Pauline M.
Struwe Weston B.
Publication date: 02/08/2017

Summary: Glycosylation is one of the most important post-translational modifications of proteins, known to be involved in pathogen recognition, innate immune response and protection of epithelial membranes. However, when compared to the tools and databases available for the processing of high-throughput proteomic data, the glycomic domain is severely lacking. While tools to assist the analysis of mass spectrometry (MS) and HPLC are continuously improving, there are few resources available to support liquid chromatography (LC)-MS/MS techniques for glycan structure profiling. Here, we present a platform for presenting oligosaccharide structures and fragment data characterized by LC-MS/MS strategies. The database is annotated with high-quality datasets and is designed to extend and reinforce those standards and ontologies developed by existing glycomics databases. Availability: http://www.unicarb-db.org Contact: [email protected]

RERO DOC Digital Library

Compression of Structured High-Throughput Sequencing Data

Author: ER Mardis
Fabien Campagne
Frederique Lisacek
H Li
James T. Robinson
Jill P. Mesirov
JK Pickrell
JR Shearstone
JT Robinson
Kevin C. Dorff
L Skrabanek
M Hsi-Yang Fritz
M Mangone
N Agrawal
N Popitsch
Nyasha Chambwe
SM Kielbasa
TD Wu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 28/11/2012

Large biological datasets are being produced at a rapid pace and create substantial storage challenges, particularly in the domain of high-throughput sequencing (HTS). Most approaches currently used to store HTS data are either unable to quickly adapt to the requirements of new sequencing or analysis methods (because they do not support schema evolution), or fail to provide state of the art compression of the datasets. We have devised new approaches to store HTS data that support seamless data schema evolution and compress datasets substantially better than existing approaches. Building on these new approaches, we discuss and demonstrate how a multi-tier data organization can dramatically reduce the storage, computational and network burden of collecting, analyzing, and archiving large sequencing datasets. For instance, we show that spliced RNA-Seq alignments can be stored in less than 4% the size of a BAM file with perfect data fidelity. Compared to the previous compression state of the art, these methods reduce dataset size more than 40% when storing exome, gene expression or DNA methylation datasets. The approaches have been integrated in a comprehensive suite of software tools (http://goby.campagnelab.org) that support common analyses for a range of high-throughput sequencing assays. National Center for Research Resources (U.S.) (Grant UL1 RR024996); Leukemia & Lymphoma Society of America (Translational Research Program Grant LLS 6304-11); National Institute of Mental Health (U.S.) (R01 MH086883

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

DSpace@MIT

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

FigShare

False discovery rate estimation and heterobifunctional cross-linkers

Author: A Belsom
A Kao
A Leitner
A Maiolica
A Sinz
AN Holding
B Yang
C Cretu
Frederique Lisacek
J Peng
J Rappsilber
JE Elias
JP Erzberger
Juri Rappsilber
L Fischer
Lutz Fischer
MA Abad
NI Brodie
P Singh
RE Moore
RG Efremov
S Herbst
S Sanowar
SH Giese
T Legal
T Walzthoeni
W Haas
X Wu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018

False discovery rate (FDR) estimation is a cornerstone of proteomics that has recently been adapted to cross-linking/mass spectrometry. Here we demonstrate that heterobifunctional cross-linkers, while theoretically different from homobifunctional cross-linkers, need not be considered separately in practice. We develop and then evaluate the impact of applying a correct FDR formula for use of heterobifunctional cross-linkers and conclude that there are minimal practical advantages. Hence a single formula can be applied to data generated from the many different non-cleavable cross-linkers.

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer

FigShare

A community proposal to integrate proteomics activities in ELIXIR

Computational approaches have been major drivers behind the progress of proteomics in recent years. The aim of this white paper is to provide a framework for integrating computational proteomics into ELIXIR in the near future, and thus to broaden the portfolio of omics technologies supported by this European distributed infrastructure. This white paper is the direct result of a strategy meeting on ‘The Future of Proteomics in ELIXIR’ that took place in March 2017 in Tübingen (Germany), and involved representatives of eleven ELIXIR nodes. These discussions led to a list of priority areas in computational proteomics that would complement existing activities and close gaps in the portfolio of tools and services offered by ELIXIR so far. We provide some suggestions on how these activities could be integrated into ELIXIR’s existing platforms, and how it could lead to a new ELIXIR use case in proteomics. We also highlight connections to the related field of metabolomics, where similar activities are ongoing. This white paper could thus serve as a starting point for the integration of computational proteomics into ELIXIR. Over the next few months we will be working closely with all stakeholders involved, and in particular with other representatives of the proteomics community, to further refine this paper

Hal - Université Grenoble Alpes

University of Groningen

ZENODO

Directory of Open Access Journals

Proceedings - University of Groningen

Lund University Publications

Crossref

ARTS repository - University of Groningen

Ghent University Academic Bibliography

Institutional Repository Universiteit Antwerpen

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

University of Southern Denmark Research Output

Utrecht University Repository

Dissertations of the University of Groningen

Large-scale mapping of bioactive peptides in structural and sequence space

Author: A Iwaniak
A Magner
AA Karelin
AFG Cicero
Agustina E. Nardo
C a. Orengo
C Mooney
C Zioudrou
CA Orengo
CC Udenigwe
D Bouglé
DG Higgins
DM Martirosyan
F Shahidi
Frederique Lisacek
G Caetano-Anollés
G Wang
GA Reeves
Gustavo Parisi
H Daniel
H Korhonen
H Meisel
I Ladunga
J Dziuba
J Soding
K Drew
L Jaroszewski
LA Dave
M Ashburner
M Yoshikawa
M. Cristina Añón
N Arroume
N Furnham
NP Möller
P Minkiewicz
R He
RJS de Castro
SF Altschul
T Shtatland
TL Pownall
VV Pak
W Wang
WM Miner-Williams
Y Huang
Y-W Li
YI Wolf
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018

The health-enhancing potential of bioactive peptides (BPs) has driven interest in food proteins as well as in the development of predictive methods. Research in this area has been especially active in using them as components in functional foods. Apparently, BPs do not have a given biological function in the proteins that contain them, and they do not evolve under independent evolutionary constraints. In this work we performed a large-scale mapping of BPs in sequence and structural space. Using well-curated BPs deposited in the BIOPEP database, we searched for exact matches in non-redundant sequence databases. Proteins containing BPs were used in fold-recognition methods to predict the corresponding folds, and BP occurrences were mapped. We found that the fold distribution of BP occurrences possibly reflects relative sequence abundance in databases. However, we also found that proteins with 5 or more BPs in their sequences correspond to well-populated protein folds, called superfolds. Also, we found that in well-populated superfamilies, BPs tend to adopt similar locations in the protein fold, suggesting the existence of hotspots. We think that our results could contribute to the development of new bioinformatics pipelines to improve BP detection.

Affiliation: Nardo, Agustina Estefania. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Centro de Investigación y Desarrollo en Criotecnología de Alimentos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Centro de Investigación y Desarrollo en Criotecnología de Alimentos. Universidad Nacional de la Plata. Facultad de Ciencias Exactas. Centro de Investigación y Desarrollo en Criotecnología de Alimentos; Argentina. Affiliation: Añon, Maria Cristina. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Centro de Investigación y Desarrollo en Criotecnología de Alimentos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Centro de Investigación y Desarrollo en Criotecnología de Alimentos. Universidad Nacional de la Plata. Facultad de Ciencias Exactas. Centro de Investigación y Desarrollo en Criotecnología de Alimentos; Argentina. Affiliation: Parisi, Gustavo Daniel. Universidad Nacional de Quilmes; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentin
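The exact-match step mentioned in this abstract (searching known bioactive peptides against protein sequences) might look roughly like the following minimal sketch. It is not the authors' pipeline: the function names, the toy peptides and sequences, and the choice to threshold on the number of occurrences rather than distinct peptides are all assumptions made for illustration.

```python
from collections import defaultdict


def map_peptides(peptides, proteins):
    """Find every exact occurrence of each peptide in each protein.

    peptides: iterable of peptide strings (hypothetical stand-in for a BP list)
    proteins: dict mapping a protein id to its amino-acid sequence
    Returns {protein_id: [(peptide, start_index), ...]} for proteins with hits.
    """
    hits = defaultdict(list)
    for prot_id, seq in proteins.items():
        for pep in peptides:
            start = seq.find(pep)
            while start != -1:
                hits[prot_id].append((pep, start))
                # Advance by one position so overlapping matches are also found.
                start = seq.find(pep, start + 1)
    return dict(hits)


def proteins_with_min_hits(hits, min_hits=5):
    """Return ids of proteins with at least min_hits peptide occurrences.

    Assumption: the threshold counts occurrences, not distinct peptides.
    """
    return sorted(p for p, occ in hits.items() if len(occ) >= min_hits)


# Toy data, invented for this example only.
toy_peptides = ["VPP", "IPP"]
toy_proteins = {"toy1": "AVPPKIPPVPP", "toy2": "GGGG"}
toy_hits = map_peptides(toy_peptides, toy_proteins)
```

A plain substring scan is enough at toy scale; for database-sized searches a multi-pattern method such as Aho-Corasick would likely be a better fit.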

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Directory of Open Access Journals

FigShare